{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Evaluation with JustCause\n",
    "\n",
    "In this notebook, we examplify how to use JustCause in order to evaluate methods using reference datasets. For simplicity, we only use one dataset, but show how evaluation works with multiple methods. Both standard causal methods implemented in the framework as well as custom methods. \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Custom First\n",
    "The goal of the JustCause framework is to be a modular and flexible facilitator of causal evaluation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The autoreload extension is already loaded. To reload it, use:\n",
      "  %reload_ext autoreload\n"
     ]
    }
   ],
   "source": [
    "%load_ext autoreload\n",
    "\n",
    "%autoreload 2\n",
    "\n",
    "# Loading all required packages \n",
    "import itertools\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "from justcause.data import Col\n",
    "from justcause.data.sets import load_ihdp\n",
    "from justcause.metrics import pehe_score, mean_absolute\n",
    "from justcause.evaluation import calc_scores, summarize_scores\n",
    "\n",
    "from sklearn.linear_model import LinearRegression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup data and methods you want to evaluate\n",
    "Let's say we wanted to compare a S-Learner with propensity weighting, based on a propensity estimate of our choice. Thus, we cannot simply use the predefined SLearner from `justcause.learners`, but have to provide our own adaption, which first estimates propensities and uses these for fitting an adjusted model. \n",
    "\n",
    "By providing a \"blackbox\" method like below, you can choose to do whatever you want inside. For example, you can replace your predictions available factual outcomes, estimate the propensity in different ways or even use a true propensity, in case of a generated dataset, where it is available. You can also resort to out-of-sample prediction, where no information about treatment is provided to the method. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from justcause.learners import SLearner\n",
    "from justcause.learners.propensity import estimate_propensities\n",
    "\n",
    "# Get the first 100 replications\n",
    "replications = load_ihdp(select_rep=np.arange(100))\n",
    "metrics = [pehe_score, mean_absolute]\n",
    "\n",
    "train_size = 0.8\n",
    "random_state = 42\n",
    "\n",
    "def weighted_slearner(train, test):\n",
    "    \"\"\"\n",
    "    Custom method that takes 'train' and 'test' CausalFrames (see causal_frames.ipynb)\n",
    "    and returns ITE predictions for both after training on 'train'. \n",
    "    \n",
    "    Implement your own method in a similar fashion to evaluate them within the framework!\n",
    "    \"\"\"\n",
    "    train_X, train_t, train_y = train.np.X, train.np.t, train.np.y\n",
    "    test_X, test_t, test_y = test.np.X, test.np.t, test.np.y\n",
    "    \n",
    "    \n",
    "    # Get calibrated propensity estimates\n",
    "    p = estimate_propensities(train_X, train_t)\n",
    "\n",
    "    # Make sure the supplied learner is able to use `sample_weights` in the fit() method\n",
    "    slearner = SLearner(LinearRegression())\n",
    "    \n",
    "    # Weight with inverse probability of treatment (inverse propensity)\n",
    "    slearner.fit(train_X, train_t, train_y, weights=1/p)\n",
    "    return (\n",
    "        slearner.predict_ite(train_X, train_t, train_y),\n",
    "        slearner.predict_ite(test_X, test_t, test_y)\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example Evaluation Loop\n",
    "Now given a callable like `weighted_slearner` we can evaluate that method using multiple metrics on the given replications. \n",
    "The result dataframe then contains two rows with the summarized scores over all replications for train and test separately. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "results_df = list()\n",
    "    \n",
    "test_scores = list()\n",
    "train_scores = list()\n",
    "\n",
    "for rep in replications:\n",
    "\n",
    "    train, test = train_test_split(\n",
    "        rep, train_size=train_size, random_state=random_state\n",
    "    )\n",
    "\n",
    "    # REPLACE this with the function you implemented and want to evaluate\n",
    "    train_ite, test_ite = weighted_slearner(train, test)\n",
    "\n",
    "    # Calculate the scores and append them to a dataframe\n",
    "    train_scores.append(calc_scores(train[Col.ite], train_ite, metrics))\n",
    "    test_scores.append(calc_scores(test[Col.ite], test_ite, metrics))\n",
    "\n",
    "# Summarize the scores and save them in a dataframe\n",
    "train_result, test_result = summarize_scores(train_scores), summarize_scores(test_scores)\n",
    "train_result.update({'method': 'weighted_slearner', 'train': True})\n",
    "test_result.update({'method': 'weighted_slearner', 'train': False})\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pehe_score-mean</th>\n",
       "      <th>pehe_score-median</th>\n",
       "      <th>pehe_score-std</th>\n",
       "      <th>mean_absolute-mean</th>\n",
       "      <th>mean_absolute-median</th>\n",
       "      <th>mean_absolute-std</th>\n",
       "      <th>method</th>\n",
       "      <th>train</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.592356</td>\n",
       "      <td>2.569472</td>\n",
       "      <td>8.248291</td>\n",
       "      <td>0.369939</td>\n",
       "      <td>0.212427</td>\n",
       "      <td>0.524395</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.493401</td>\n",
       "      <td>2.589651</td>\n",
       "      <td>7.903174</td>\n",
       "      <td>0.655602</td>\n",
       "      <td>0.287201</td>\n",
       "      <td>0.941941</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   pehe_score-mean  pehe_score-median  pehe_score-std  mean_absolute-mean  \\\n",
       "0         5.592356           2.569472        8.248291            0.369939   \n",
       "1         5.493401           2.589651        7.903174            0.655602   \n",
       "\n",
       "   mean_absolute-median  mean_absolute-std             method  train  \n",
       "0              0.212427           0.524395  weighted_slearner   True  \n",
       "1              0.287201           0.941941  weighted_slearner  False  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame([train_result, test_result])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now in this case, using `justcause` has hardly any advantages, because only one dataset and one method is used. You might as well just implement all the evaluation manually. However, this can simply be expanded to more methods by looping over the callables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def basic_slearner(train, test):\n",
    "    \"\"\" \"\"\"\n",
    "    train_X, train_t, train_y = train.np.X, train.np.t, train.np.y\n",
    "    test_X, test_t, test_y = test.np.X, test.np.t, test.np.y\n",
    "\n",
    "    slearner = SLearner(LinearRegression())\n",
    "    slearner.fit(train_X, train_t, train_y)\n",
    "    return (\n",
    "        slearner.predict_ite(train_X, train_t, train_y),\n",
    "        slearner.predict_ite(test_X, test_t, test_y)\n",
    "    )\n",
    "\n",
    "methods = [basic_slearner, weighted_slearner]\n",
    "\n",
    "results = list()\n",
    "\n",
    "for method in methods:\n",
    "    \n",
    "    test_scores = list()\n",
    "    train_scores = list()\n",
    "\n",
    "    for rep in replications:\n",
    "\n",
    "        train, test = train_test_split(\n",
    "            rep, train_size=train_size, random_state=random_state\n",
    "        )\n",
    "\n",
    "        # REPLACE this with the function you implemented and want to evaluate\n",
    "        train_ite, test_ite = method(train, test)\n",
    "\n",
    "        # Calculate the scores and append them to a dataframe\n",
    "        test_scores.append(calc_scores(test[Col.ite], test_ite, metrics))\n",
    "        train_scores.append(calc_scores(train[Col.ite], train_ite, metrics))\n",
    "\n",
    "    # Summarize the scores and save them in a dataframe\n",
    "    train_result, test_result = summarize_scores(train_scores), summarize_scores(test_scores)\n",
    "    train_result.update({'method': method.__name__, 'train': True})\n",
    "    test_result.update({'method': method.__name__, 'train': False})\n",
    "\n",
    "    results.append(train_result)\n",
    "    results.append(test_result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pehe_score-mean</th>\n",
       "      <th>pehe_score-median</th>\n",
       "      <th>pehe_score-std</th>\n",
       "      <th>mean_absolute-mean</th>\n",
       "      <th>mean_absolute-median</th>\n",
       "      <th>mean_absolute-std</th>\n",
       "      <th>method</th>\n",
       "      <th>train</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.633660</td>\n",
       "      <td>2.623297</td>\n",
       "      <td>8.362125</td>\n",
       "      <td>0.732443</td>\n",
       "      <td>0.238185</td>\n",
       "      <td>1.493276</td>\n",
       "      <td>basic_slearner</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.625971</td>\n",
       "      <td>2.635993</td>\n",
       "      <td>8.213626</td>\n",
       "      <td>1.292668</td>\n",
       "      <td>0.396246</td>\n",
       "      <td>2.474603</td>\n",
       "      <td>basic_slearner</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.592356</td>\n",
       "      <td>2.569472</td>\n",
       "      <td>8.248291</td>\n",
       "      <td>0.369939</td>\n",
       "      <td>0.212427</td>\n",
       "      <td>0.524395</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>5.493401</td>\n",
       "      <td>2.589651</td>\n",
       "      <td>7.903174</td>\n",
       "      <td>0.655602</td>\n",
       "      <td>0.287201</td>\n",
       "      <td>0.941941</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   pehe_score-mean  pehe_score-median  pehe_score-std  mean_absolute-mean  \\\n",
       "0         5.633660           2.623297        8.362125            0.732443   \n",
       "1         5.625971           2.635993        8.213626            1.292668   \n",
       "2         5.592356           2.569472        8.248291            0.369939   \n",
       "3         5.493401           2.589651        7.903174            0.655602   \n",
       "\n",
       "   mean_absolute-median  mean_absolute-std             method  train  \n",
       "0              0.238185           1.493276     basic_slearner   True  \n",
       "1              0.396246           2.474603     basic_slearner  False  \n",
       "2              0.212427           0.524395  weighted_slearner   True  \n",
       "3              0.287201           0.941941  weighted_slearner  False  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# For visualization\n",
    "pd.DataFrame(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And because in most cases, we're not changing anything within this loop for the ITE case, `justcause` provides a default implementation. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Standard Evaluation of ITE predictions\n",
    "Using the same list of method callables, we can just call `evaluate_ite` and pass all the information. The default implementation sets up a dataframe for the result following a certain convention. \n",
    "\n",
    "First, there's two columns to define the method for which the results are as well as whether they've been calculated on train or test. Then for all supplied `metrics`, all `formats` will be listed. \n",
    "\n",
    "Standard `metrics` like (PEHE or Mean absolute error) are implemented in `justcause.metrics`. \n",
    "Standard formats used for summarizing the scores over multiple replications are `np.mean, np.median, np.std`, other possibly interesting formats could be *skewness*, *minmax*, *kurtosis*. A method provided as format must take an `axis` parameter, ensuring that it can be applied to the scores dataframe. \n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "from justcause.evaluation import evaluate_ite\n",
    "\n",
    "result = evaluate_ite(replications, methods, metrics, train_size=train_size, random_state=random_state)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pehe_score-mean</th>\n",
       "      <th>pehe_score-median</th>\n",
       "      <th>pehe_score-std</th>\n",
       "      <th>mean_absolute-mean</th>\n",
       "      <th>mean_absolute-median</th>\n",
       "      <th>mean_absolute-std</th>\n",
       "      <th>method</th>\n",
       "      <th>train</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.633660</td>\n",
       "      <td>2.623297</td>\n",
       "      <td>8.362125</td>\n",
       "      <td>0.732443</td>\n",
       "      <td>0.238185</td>\n",
       "      <td>1.493276</td>\n",
       "      <td>basic_slearner</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.633660</td>\n",
       "      <td>2.623297</td>\n",
       "      <td>8.362125</td>\n",
       "      <td>0.732443</td>\n",
       "      <td>0.238185</td>\n",
       "      <td>1.493276</td>\n",
       "      <td>basic_slearner</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.592356</td>\n",
       "      <td>2.569472</td>\n",
       "      <td>8.248291</td>\n",
       "      <td>0.369939</td>\n",
       "      <td>0.212427</td>\n",
       "      <td>0.524395</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>5.592356</td>\n",
       "      <td>2.569472</td>\n",
       "      <td>8.248291</td>\n",
       "      <td>0.369939</td>\n",
       "      <td>0.212427</td>\n",
       "      <td>0.524395</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   pehe_score-mean  pehe_score-median  pehe_score-std  mean_absolute-mean  \\\n",
       "0         5.633660           2.623297        8.362125            0.732443   \n",
       "1         5.633660           2.623297        8.362125            0.732443   \n",
       "2         5.592356           2.569472        8.248291            0.369939   \n",
       "3         5.592356           2.569472        8.248291            0.369939   \n",
       "\n",
       "   mean_absolute-median  mean_absolute-std             method  train  \n",
       "0              0.238185           1.493276     basic_slearner   True  \n",
       "1              0.238185           1.493276     basic_slearner  False  \n",
       "2              0.212427           0.524395  weighted_slearner   True  \n",
       "3              0.212427           0.524395  weighted_slearner  False  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Adding standard causal methods to the mix\n",
    "Within `justcause.learners` we've implemented a couple of standard methods that provide a `predict_ite()` method. Instead of going the tedious way like we've done in `weighted_slearner` above, we can just use these methods directly. The default implementation will use a default base learner for all the meta-learners, fit the method on train and predict the ITEs for train and test. \n",
    "\n",
    "By doing so, we can get rid of the `basic_slearner` method above, because it just uses the default setting and procedure for fitting the model. Instead, we just use `SLearner(LinearRegression())`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "from justcause.learners import TLearner, XLearner, RLearner\n",
    "\n",
    "# All in standard configuration\n",
    "methods = [SLearner(LinearRegression()), weighted_slearner, TLearner(), XLearner(), RLearner(LinearRegression())]\n",
    "\n",
    "result = evaluate_ite(replications, methods, metrics, train_size=train_size, random_state=random_state)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pehe_score-mean</th>\n",
       "      <th>pehe_score-median</th>\n",
       "      <th>pehe_score-std</th>\n",
       "      <th>mean_absolute-mean</th>\n",
       "      <th>mean_absolute-median</th>\n",
       "      <th>mean_absolute-std</th>\n",
       "      <th>method</th>\n",
       "      <th>train</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.633660</td>\n",
       "      <td>2.623297</td>\n",
       "      <td>8.362125</td>\n",
       "      <td>0.732443</td>\n",
       "      <td>0.238185</td>\n",
       "      <td>1.493276</td>\n",
       "      <td>SLearner(learner=LinearRegression)</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.633660</td>\n",
       "      <td>2.623297</td>\n",
       "      <td>8.362125</td>\n",
       "      <td>0.732443</td>\n",
       "      <td>0.238185</td>\n",
       "      <td>1.493276</td>\n",
       "      <td>SLearner(learner=LinearRegression)</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.592356</td>\n",
       "      <td>2.569472</td>\n",
       "      <td>8.248291</td>\n",
       "      <td>0.369939</td>\n",
       "      <td>0.212427</td>\n",
       "      <td>0.524395</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>5.592356</td>\n",
       "      <td>2.569472</td>\n",
       "      <td>8.248291</td>\n",
       "      <td>0.369939</td>\n",
       "      <td>0.212427</td>\n",
       "      <td>0.524395</td>\n",
       "      <td>weighted_slearner</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.572626</td>\n",
       "      <td>2.543798</td>\n",
       "      <td>8.213573</td>\n",
       "      <td>0.293187</td>\n",
       "      <td>0.166370</td>\n",
       "      <td>0.428028</td>\n",
       "      <td>TLearner(control=LassoLars, treated=LassoLars)</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5.572626</td>\n",
       "      <td>2.543798</td>\n",
       "      <td>8.213573</td>\n",
       "      <td>0.293187</td>\n",
       "      <td>0.166370</td>\n",
       "      <td>0.428028</td>\n",
       "      <td>TLearner(control=LassoLars, treated=LassoLars)</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>5.579297</td>\n",
       "      <td>2.543798</td>\n",
       "      <td>8.240655</td>\n",
       "      <td>0.289592</td>\n",
       "      <td>0.166370</td>\n",
       "      <td>0.427021</td>\n",
       "      <td>XLearner(outcome_c=LassoLars, outcome_t=LassoL...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>5.579297</td>\n",
       "      <td>2.543798</td>\n",
       "      <td>8.240655</td>\n",
       "      <td>0.289592</td>\n",
       "      <td>0.166370</td>\n",
       "      <td>0.427021</td>\n",
       "      <td>XLearner(outcome_c=LassoLars, outcome_t=LassoL...</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>2.637110</td>\n",
       "      <td>1.277486</td>\n",
       "      <td>3.824333</td>\n",
       "      <td>0.234029</td>\n",
       "      <td>0.196398</td>\n",
       "      <td>0.206225</td>\n",
       "      <td>RLearner(outcome=LinearRegression, effect=Line...</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>2.637110</td>\n",
       "      <td>1.277486</td>\n",
       "      <td>3.824333</td>\n",
       "      <td>0.234029</td>\n",
       "      <td>0.196398</td>\n",
       "      <td>0.206225</td>\n",
       "      <td>RLearner(outcome=LinearRegression, effect=Line...</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   pehe_score-mean  pehe_score-median  pehe_score-std  mean_absolute-mean  \\\n",
       "0         5.633660           2.623297        8.362125            0.732443   \n",
       "1         5.633660           2.623297        8.362125            0.732443   \n",
       "2         5.592356           2.569472        8.248291            0.369939   \n",
       "3         5.592356           2.569472        8.248291            0.369939   \n",
       "4         5.572626           2.543798        8.213573            0.293187   \n",
       "5         5.572626           2.543798        8.213573            0.293187   \n",
       "6         5.579297           2.543798        8.240655            0.289592   \n",
       "7         5.579297           2.543798        8.240655            0.289592   \n",
       "8         2.637110           1.277486        3.824333            0.234029   \n",
       "9         2.637110           1.277486        3.824333            0.234029   \n",
       "\n",
       "   mean_absolute-median  mean_absolute-std  \\\n",
       "0              0.238185           1.493276   \n",
       "1              0.238185           1.493276   \n",
       "2              0.212427           0.524395   \n",
       "3              0.212427           0.524395   \n",
       "4              0.166370           0.428028   \n",
       "5              0.166370           0.428028   \n",
       "6              0.166370           0.427021   \n",
       "7              0.166370           0.427021   \n",
       "8              0.196398           0.206225   \n",
       "9              0.196398           0.206225   \n",
       "\n",
       "                                              method  train  \n",
       "0                 SLearner(learner=LinearRegression)   True  \n",
       "1                 SLearner(learner=LinearRegression)  False  \n",
       "2                                  weighted_slearner   True  \n",
       "3                                  weighted_slearner  False  \n",
       "4     TLearner(control=LassoLars, treated=LassoLars)   True  \n",
       "5     TLearner(control=LassoLars, treated=LassoLars)  False  \n",
       "6  XLearner(outcome_c=LassoLars, outcome_t=LassoL...   True  \n",
       "7  XLearner(outcome_c=LassoLars, outcome_t=LassoL...  False  \n",
       "8  RLearner(outcome=LinearRegression, effect=Line...   True  \n",
       "9  RLearner(outcome=LinearRegression, effect=Line...  False  "
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame(result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
